Robust speech recognition by modifying clean and telephone feature vectors using bidirectional neural network

نویسندگان

  • Mansoor Vali
  • Seyyed Ali Seyyed Salehi
  • Kazem Karimi
چکیده

In this paper we present a new method for nonlinear compensation of distortions, e.g. channel effects and additive noise, in clean and telephone speech recognition. A Bidirectional Neural Network (BidiNN) was developed and implemented in order to modify distorted input feature vectors and improve the overall recognition accuracy. Distorted components in feature vectors were estimated in accordance with the latent knowledge in the hidden layer of the neural network. This knowledge is obtained by training with clean and telephone speech, simultaneously and is mostly induced by phonemic content and less influenced by the irrelevant variations in speech signal. An MLP neural network was trained with these modified feature vectors. Comparing the achieved results with a reference model that was trained with unmodified feature vectors, demonstrate significant improvement in clean and telephone speech recognition accuracy.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

روشی جدید در بازشناسی مقاوم گفتار مبتنی بر دادگان مفقود با استفاده از شبکه عصبی دوسویه

Performance of speech recognition systems is greatly reduced when speech corrupted by noise. One common method for robust speech recognition systems is missing feature methods. In this way, the components in time - frequency representation of signal (Spectrogram) that present low signal to noise ratio (SNR), are tagged as missing and deleted then replaced by remained components and statistical ...

متن کامل

Bidirectional Neural Network for Feature Compensation of Clean and Telephone Speech Signals

In this paper, we continue our previous work on nonlinear feature compensation of distortions in clean and telephone speech recognition systems. We have shown that Bidirectional Neural Network (Bidi-NN) can compensate nonlinearly-distorted components of feature vectors. In this study, we present a new effort to improve recognition accuracy on clean and telephone speech data by employing a two-s...

متن کامل

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

متن کامل

شبکه عصبی پیچشی با پنجره‌های قابل تطبیق برای بازشناسی گفتار

Although, speech recognition systems are widely used and their accuracies are continuously increased, there is a considerable performance gap between their accuracies and human recognition ability. This is partially due to high speaker variations in speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

متن کامل

Improving the performance of MFCC for Persian robust speech recognition

The Mel Frequency cepstral coefficients are the most widely used feature in speech recognition but they are very sensitive to noise. In this paper to achieve a satisfactorily performance in Automatic Speech Recognition (ASR) applications we introduce a noise robust new set of MFCC vector estimated through following steps. First, spectral mean normalization is a pre-processing which applies to t...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006